153 research outputs found

    A C-DAG task model for scheduling complex real-time tasks on heterogeneous platforms: preemption matters

    Recent commercial hardware platforms for embedded real-time systems feature heterogeneous processing units and computing accelerators on the same System-on-Chip. When designing complex real-time applications for such architectures, the designer needs to make a number of difficult choices: on which processor should a certain task be implemented? Should a component be implemented in parallel or sequentially? These choices may have a great impact on feasibility, as differences in the processors' internal architectures affect the tasks' execution times and preemption costs. To help the designer explore the wide space of design choices and tune the scheduling parameters, in this paper we propose a novel real-time application model, called C-DAG, specifically conceived for heterogeneous platforms. A C-DAG allows the designer to specify alternative implementations of the same application component for different processing engines, to be selected off-line, as well as conditional branches that model if-then-else statements, to be selected at run-time. We also propose a schedulability analysis for the C-DAG model and a heuristic allocation algorithm so that all deadlines are respected. Our analysis takes into account the cost of preempting a task, which can be non-negligible on certain processors. We demonstrate the effectiveness of our approach on a large set of synthetic experiments by comparing it with state-of-the-art algorithms in the literature.

    Work-in-Progress: NVIDIA GPU Scheduling Details in Virtualized Environments

    Modern automotive-grade embedded platforms feature high-performance Graphics Processing Units (GPUs) to support the massively parallel processing power needed for next-generation autonomous driving applications. Hence, a GPU scheduling approach with strong real-time guarantees is needed. While previous research efforts focused on reverse engineering the GPU ecosystem in order to understand and control GPU scheduling on NVIDIA platforms, we provide an in-depth explanation of the standard NVIDIA approach to GPU application scheduling on a Drive PX platform. Then, we discuss how a privileged scheduling server can be used to enforce arbitrary scheduling policies in a virtualized environment.

    A Novel Real-Time Edge-Cloud Big Data Management and Analytics Framework for Smart Cities

    Exposing city information to dynamic, distributed, powerful, scalable, and user-friendly big data systems is expected to open up a wide range of new opportunities; however, the size, heterogeneity and geographical dispersion of data often make it difficult to combine, analyze and consume them in a single system. In the context of the H2020 CLASS project, we describe an innovative framework aiming to facilitate the design of advanced big-data analytics workflows. The proposal covers the whole compute continuum, from edge to cloud, and relies on a well-organized distributed infrastructure exploiting: a) edge solutions with advanced computer vision technologies enabling the real-time generation of “rich” data from a vast array of sensor types; b) cloud data management techniques offering efficient storage, real-time querying and updating of the high-frequency incoming data at different granularity levels. We specifically focus on obstacle detection and tracking for edge processing, and consider a traffic density monitoring application with hierarchical data aggregation features for cloud processing; the discussed techniques will constitute the groundwork enabling many further services. The tests are performed on the real use case of the Modena Automotive Smart Area (MASA).

    Exploring the sequence length bottleneck in the Transformer for Image Captioning

    Most recent state-of-the-art architectures rely on combinations and variations of three approaches: convolutional, recurrent and self-attentive methods. Our work attempts to lay the basis for a new research direction for sequence modeling based upon the idea of modifying the sequence length. To that end, we propose a new method called "Expansion Mechanism", which transforms the input sequence, either dynamically or statically, into a new one featuring a different sequence length. Furthermore, we introduce a novel architecture that exploits this method and achieves competitive performance on the MS-COCO 2014 data set, yielding 134.6 and 131.4 CIDEr-D on the Karpathy test split in the ensemble and single-model configurations respectively, and 130 CIDEr-D on the official online evaluation server, despite being neither recurrent nor fully attentive. At the same time, we address the efficiency aspect in our design and introduce a convenient training strategy suitable for most computational resources, in contrast to the standard one. Source code is available at https://github.com/jchenghu/explorin

    A Perspective on Safety and Real-Time Issues for GPU Accelerated ADAS

    The current trend in designing Advanced Driving Assistance Systems (ADAS) is to enhance their computing power by using modern multi/many-core accelerators. For many critical applications, such as pedestrian detection, line following, and path planning, the Graphics Processing Unit (GPU) is the most popular choice for obtaining orders-of-magnitude increases in performance at modest power consumption. This is made possible by exploiting the general-purpose nature of today's GPUs, as such devices are known to deliver unprecedented performance per watt on generic embarrassingly parallel workloads (as opposed to just graphical rendering, which is all that GPUs of previous generations were designed to sustain). In this work, we explore novel challenges that system engineers have to face in terms of real-time constraints and functional safety when the GPU is the chosen accelerator. More specifically, we investigate to what extent the safety standards currently applied to traditional platforms can be translated to a GPU-accelerated platform used in critical scenarios.

    API Comparison of CPU-To-GPU Command Offloading Latency on Embedded Platforms (Artifact)

    High-performance heterogeneous embedded platforms allow offloading of parallel workloads to an integrated accelerator, such as General Purpose-Graphic Processing Units (GP-GPUs). A time-predictable characterization of task submission is a must in real-time applications. We provide a profiler of the time spent by the CPU in submitting a stereotypical GP-GPU workload shaped as a Deep Neural Network of parameterized complexity. The submission is performed using the latest APIs available: NVIDIA CUDA, including its various techniques, and Vulkan. Complete automation of the tests on the Jetson Xavier is also provided by scripts that install software dependencies, run the experiments, and collect the results in a PDF report.

    Novel Methodologies for Predictable CPU-To-GPU Command Offloading

    There is increasing industrial and academic interest in a more predictable characterization of real-time tasks on high-performance heterogeneous embedded platforms, where a host system offloads parallel workloads to an integrated accelerator, such as General Purpose-Graphic Processing Units (GP-GPUs). In this paper, we analyze an important aspect that has not yet been considered in the real-time literature, and that may significantly affect real-time performance if not properly treated, i.e., the time spent by the CPU in submitting GP-GPU operations. We show that the impact of CPU-to-GPU kernel submissions may indeed be relevant for typical real-time workloads, and that it should be properly factored in when deriving an integrated schedulability analysis for the considered platforms. This is the case when an application is composed of many small and consecutive GPU compute/copy operations. While existing techniques mitigate this issue by batching kernel calls into a reduced number of persistent kernel invocations, in this work we present and evaluate three other approaches that are made possible by recently released versions of the NVIDIA CUDA GP-GPU API, and by Vulkan, a novel open standard GPU API that allows improved control of GPU command submissions. We show that this added control may significantly improve application performance and predictability due to a substantial reduction in CPU-to-GPU driver interactions, making Vulkan an interesting candidate for becoming the state-of-the-art API for heterogeneous real-time systems. Our findings are evaluated on a latest-generation NVIDIA Jetson AGX Xavier embedded board, executing typical workloads involving Deep Neural Networks of parameterized complexity.

    Environmental conditions in river segments intercepted by culverts

    The conservation and maintenance of the quality of the rheophilic environment are directly related to knowledge of the physical and chemical characteristics and structural patterns of these systems, especially in streams. Long stretches of small water bodies are highly altered by the construction of highways and roads, which tend to modify their natural characteristics, affecting environmental quality. This study describes vegetation and morphogeometric parameters of streams with culverts along their courses, reporting spatial differences in environmental characteristics (vegetation, morphogeometric, physical, and chemical) between sampling points upstream and downstream of the culvert. Specifically, we evaluated the width, depth, riparian vegetation, bottom substrate, and physical and chemical properties of the water, to identify possible differences between the sections above and below (upstream and downstream of) the culvert. The rapid assessment protocol (RAP) was applied to stretches of 200 meters upstream and downstream of culverts in two Neotropical streams, between November 2009 and October 2010. The vegetation and morphogeometric attributes differed between the stretches upstream and downstream of the culverts, because the culvert has an impoundment effect. This turns the upstream stretch into a flooded, often shallow environment and directly influences the movement of sediment. The physical and chemical variables of the water showed no spatial variation.

    Morphology change in nematic membranes induced by defects

    The cell membrane is one of the most important structures of living organisms. This is due to the many functions attributed to it, such as selective permeability, protection, anchoring to the cytoskeleton and many others. Any change in the shape of the cell membrane may directly affect its properties and functions. In this article, we study how defects in the liquid crystalline organization of a membrane can change its shape. To do so, we consider a membrane with orientational order, i.e., a nematic membrane, which can occur in biological membranes, nematic films and other systems, and study how a defect in this order can change the shape of the membrane when the bending rigidity is considered. We find that, depending on the ratio of rigidity to elastic constant, the membrane may buckle and turn into a pseudo-sphere.